Abstract. For $1 < p < \infty$, a weight $w \in A_p$, and a function in $L^p([0,1], w)$, we show that
variational sums with sufficiently large exponents of its Walsh-Fourier series are bounded in
$L^p(w)$. This strengthens a result of Hunt-Young and is a weighted extension of a variation
norm Carleson theorem of Oberlin-Seeger-Tao-Thiele-Wright. The proof uses phase plane
analysis and a weighted extension of a variational inequality of Lépingle.



1. Introduction

Let $f$ be a measurable function on $[0, 1]$. The Walsh-Fourier series of $f$, given by

$$\sum_{k \ge 0} \langle f, W_k\rangle W_k(x) ,$$

is a dyadic analogue of the Fourier series. We shall recall the definition of the Walsh system
of functions $(W_k)_{k \ge 0}$ in Section 2. It is standard that boundedness in $L^p$ of the maximal
Walsh-Fourier sum

$$Sf(x) := \sup_n |(S_n f)(x)| , \qquad S_n f(x) := \sum_{0 \le k \le n} \langle f, W_k\rangle W_k(x) ,$$

leads to a.e. convergence of the Walsh-Fourier series of functions in $L^p$. For $1 < p < \infty$, this
result holds, and is the Carleson theorem [2] on the pointwise convergence of Fourier series.
Also see Hunt [6], for $1 < p < 2$, and Sjölin [22] for the Walsh case.

We are concerned with weighted estimates. For $1 < p < \infty$ recall that a weight $w$, positive a.e.,
is in $A_p$ if the following bound holds uniformly over (dyadic) intervals $I$:

$$\sup_I \Big(\frac{1}{|I|}\int_I w(x)\,dx\Big)\Big(\frac{1}{|I|}\int_I w(x)^{-1/(p-1)}\,dx\Big)^{p-1} < \infty .$$
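As a rough numerical illustration of the $A_p$ condition, the sketch below evaluates the supremum over dyadic intervals for a weight that is assumed to be constant on each of $2^L$ dyadic cells of $[0,1)$. The function name `dyadic_ap_constant` and the discretization are assumptions made for this example only; they are not taken from the paper.

```python
def dyadic_ap_constant(w, p):
    """Sup over dyadic intervals of avg(w) * avg(w^(-1/(p-1)))^(p-1).

    `w` lists the (positive) values of a weight on 2**L equal dyadic cells
    of [0, 1); this piecewise-constant model is an assumption of the sketch.
    """
    n = len(w)
    dual = [v ** (-1.0 / (p - 1)) for v in w]
    best = 0.0
    size = 1
    while size <= n:
        # Visit every dyadic interval made of `size` consecutive cells.
        for start in range(0, n, size):
            avg_w = sum(w[start:start + size]) / size
            avg_dual = sum(dual[start:start + size]) / size
            best = max(best, avg_w * avg_dual ** (p - 1))
        size *= 2
    return best
```

For a constant weight the value is 1, and by Hölder's inequality every interval contributes at least 1, so larger values measure how far the weight is from being constant at dyadic scales.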



In this paper we prove the following theorems. Below, $S_n f$ is taken to be $0$ for $n < 0$.
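Theorem 1.1. Let $1 < p < \infty$ and $w \in A_p$. Then there is an $R = R(p, [w]_{A_p}) < \infty$ such
that for all $r \in (R, \infty]$ we have

$$(1.1)\qquad \Big\|\sup_{M,\,N_0<\dots<N_M}\Big(\sum_{j=1}^{M}|S_{N_j}f - S_{N_{j-1}}f|^r\Big)^{1/r}\Big\|_{L^p(w)} \le C\|f\|_{L^p(w)}$$

for some constant $C$ depending only on $w$, $p$, $r$.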

The simpler endpoint case $r = \infty$ of Theorem 1.1 is the Walsh-Fourier analogue of a
theorem of Hunt and Young [7] (cf. [5] for extensions to more general settings). For
$r < \infty$, the estimate (1.1) gives more quantitative information about the convergence rate of
Walsh-Fourier series.
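To make the notion of a variational sum concrete, here is a minimal sketch that computes the $r$-variation of a finite sequence, that is, the supremum over increasing index chains $N_0 < \dots < N_M$ of $(\sum_j |a_{N_j} - a_{N_{j-1}}|^r)^{1/r}$. In the setting of this paper the sequence would be the partial sums $S_n f(x)$ at a fixed point $x$. The helper name `r_variation` and the quadratic-time dynamic programme are choices made for this illustration, not part of the paper.

```python
def r_variation(seq, r):
    """r-variation of a finite sequence via dynamic programming.

    best[i] stores the largest value of sum |seq[N_j] - seq[N_{j-1}]|**r
    over increasing index chains that end at index i.
    """
    n = len(seq)
    best = [0.0] * n
    for i in range(n):
        for j in range(i):
            cand = best[j] + abs(seq[i] - seq[j]) ** r
            if cand > best[i]:
                best[i] = cand
    return max(best, default=0.0) ** (1.0 / r)
```

For $r = 1$ this is the total variation, and as $r$ grows the value decreases towards the oscillation $\max - \min$, which corresponds to the endpoint case $r = \infty$.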

Theorem 1.1 is a consequence of the following more general theorem: 

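Theorem 1.2. Let $1 < p < \infty$ and $w \in A_q$ for some $q \in [1, p)$. Then for $r \in (2q, \infty]$ such
that $1/r < 1/q - 1/p$, it holds that

$$(1.2)\qquad \Big\|\sup_{M,\,N_0<\dots<N_M}\Big(\sum_{j=1}^{M}|S_{N_j}f - S_{N_{j-1}}f|^r\Big)^{1/r}\Big\|_{L^p(w)} \le C\|f\|_{L^p(w)}$$

for some constant $C$ depending only on $w$, $p$, $q$, $r$.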

To see how Theorem 1.2 implies Theorem 1.1, take $1 < p < \infty$ and $w \in A_p$. Note that the
$A_p$ condition is an open condition, so for some $\epsilon > 0$ there holds $w \in A_{p-\epsilon}$ (see for instance
[13]); then apply Theorem 1.2 with $q = p - \epsilon$.

The Fourier case of Theorem 1.2, corresponding to $w \equiv 1 \in A_1$, is a theorem of
Oberlin-Seeger-Tao-Thiele-Wright [19] (cf. [18]). Using this result, one can see that the conclusion
of Theorem 1.1 must depend upon $w \in A_p$. Suppose that there were a fixed $r < \infty$ and
$1 < p < \infty$ for which (1.1) holds for all $w \in A_p$. Using Rubio de Francia's extrapolation
theorem, we see that this same inequality would have to hold for $w$ being Lebesgue measure
and all $1 < p < \infty$. This contradicts the (Fourier) examples that are in [18, Section 2].

The proof of Theorem 1.2 uses two main ingredients: adaptation of phase plane analysis to
weighted settings, and a weighted extension of a classical variational inequality of Lépingle
(Lemma 6.1). The approach used in this paper is a weighted extension of the approach in
[18, 19], and in particular it is different from the elegant approach of Hunt-Young [7], who
use a good-$\lambda$ argument to upgrade the boundedness of the Carleson operator (the Fourier
analogue of $S$) in the setting of Lebesgue measure to the setting of $A_p$ weights. A naive
adaptation of the good-$\lambda$ approach does not apply to the variational estimates for Carleson's
operator.

We became interested in new approaches towards boundedness of Walsh-Fourier series
in weighted settings while investigating questions related to weighted bounds for multilinear
oscillatory operators, such as the bilinear Hilbert transform (whose boundedness in the
Lebesgue setting is well-known from the work of Lacey and Thiele [9, 10]). To the knowledge
of the authors, there hasn't been any adaptation of the Hunt-Young approach to the setting of
multilinear oscillatory operators. Standard approaches towards multilinear oscillatory operators
(started with Lacey-Thiele [10] and further developed by Muscalu-Tao-Thiele [14-17])
require detailed analysis on the phase plane, and this motivates us to consider a weighted
adaptation of the time-frequency analysis framework.

In this paper, we only consider analysis on the Walsh phase plane, which is certainly easier
than the Fourier case, although there are qualitative similarities between the two phase
planes. Extension of the argument in this paper to the Fourier setting is a nontrivial task.
In the weighted setting, there is a lack of $L^2$ orthogonality for Walsh packets, therefore some
changes are needed in the way one proves the so-called size lemma. In fact, we will use a
sharp function estimate similar to an argument of Rubio de Francia in [21]; one can view
this as a substitute for the good-$\lambda$ argument of Hunt-Young in the phase plane. Our proof
of Theorem 1.2 requires a weighted extension of the Lépingle inequality for variation norms,
and this is proved in Lemma 6.1.

2. Walsh functions and Walsh packets 

We recall standard properties of Walsh functions and Walsh packets below. A good reference
is [23]. The Walsh functions $W_0(x), W_1(x), \dots$ are supported in $[0, 1]$ and can be defined
recursively by $W_0(x) = 1_{[0,1)}$, and for even and odd integers,

$$W_{2n}(x) = W_n(2x)1_{[0,1/2)} + W_n(2x - 1)1_{[1/2,1)} , \qquad n \ge 1 ,$$

$$W_{2n+1}(x) = W_n(2x)1_{[0,1/2)} - W_n(2x - 1)1_{[1/2,1)} , \qquad n \ge 0 .$$
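The recursion can be sketched directly in code. The example below samples $W_n$ on the $2^L$ dyadic cells of $[0,1)$, on which $W_n$ is constant whenever $n < 2^L$; the function name `walsh` and the list representation are illustrative assumptions rather than notation from the paper.

```python
def walsh(n, level):
    """Values of W_n on the 2**level equal dyadic cells of [0, 1).

    Requires n < 2**level, so that W_n is constant on each cell.
    """
    if n == 0:
        return [1] * (2 ** level)
    if level == 0:
        raise ValueError("need n < 2**level to resolve W_n on the grid")
    half = walsh(n // 2, level - 1)      # samples of W_{n//2}
    sign = 1 if n % 2 == 0 else -1       # even index: '+', odd index: '-'
    # Left half carries W_{n//2}(2x); right half carries +/- W_{n//2}(2x - 1).
    return half + [sign * v for v in half]
```

With this representation the inner product of two Walsh functions is the average of the cellwise products, so orthonormality of $W_0, \dots, W_{2^L - 1}$ can be checked exactly on the grid.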

A dyadic rectangle in $\mathbb{R}_+ \times \mathbb{R}_+$ with area one is referred to as a tile. The Walsh packet
associated with a tile

$$p = [2^j m, 2^j(m + 1)) \times [2^{-j} n, 2^{-j}(n + 1)) = I_p \times \omega_p$$

is an $L^2$-normalized function supported in the spatial interval $I_p := [2^j m, 2^j(m + 1))$ and is
defined by

$$\phi_p(x) = 2^{-j/2} W_n(2^{-j}x - m) .$$

For two tiles $p_1$ and $p_2$ such that $p_1 \cap p_2 \ne \emptyset$, we say that $p_1 \le p_2$ if $I_{p_1} \subset I_{p_2}$. This
clearly implies $\omega_{p_1} \supset \omega_{p_2}$; furthermore there is a close connection between the partial order
and orthogonality. Two tiles $p_1$ and $p_2$ are not ordered under '$\le$' if and only if the tiles do
not intersect in the plane, if and only if $\langle \phi_{p_1}, \phi_{p_2}\rangle = 0$.
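The link between disjointness of tiles and orthogonality of packets can be checked by brute force at a small resolution. In the sketch below a tile is encoded as a triple `(s, m, n)` standing for $[m2^{-s}, (m+1)2^{-s}) \times [n2^{s}, (n+1)2^{s})$ (so $s = -j$ in the notation above), and packets are left unnormalized since only the vanishing of inner products matters. The encoding, the helper names, and the closed-form evaluation of $W_n$ through binary digits are assumptions of this example.

```python
def walsh_value(n, cell, level):
    """W_n on dyadic cell number `cell` (out of 2**level), using binary digits:
    digit i of n is paired with the (i+1)-th binary digit of the point x."""
    parity = 0
    for i in range(level):
        parity ^= (n >> i) & (cell >> (level - 1 - i)) & 1
    return -1 if parity else 1


def packet(tile, L):
    """Unnormalized samples, on 2**L cells of [0, 1), of the packet of `tile`."""
    s, m, n = tile
    width = 2 ** (L - s)                 # number of cells inside I_p
    lo, hi = m * width, (m + 1) * width
    return [walsh_value(n, c - lo, L - s) if lo <= c < hi else 0
            for c in range(2 ** L)]


def tiles_intersect(t1, t2):
    """True when the two dyadic rectangles overlap in the phase plane."""
    (s1, m1, n1), (s2, m2, n2) = t1, t2
    time = m1 * 2 ** s2 < (m2 + 1) * 2 ** s1 and m2 * 2 ** s1 < (m1 + 1) * 2 ** s2
    freq = n1 * 2 ** s1 < (n2 + 1) * 2 ** s2 and n2 * 2 ** s2 < (n1 + 1) * 2 ** s1
    return time and freq


def all_tiles(L):
    """All tiles contained in [0, 1) x [0, 2**L)."""
    return [(s, m, n) for s in range(L + 1)
            for m in range(2 ** s) for n in range(2 ** (L - s))]
```

Looping over all pairs from `all_tiles(L)` and comparing `tiles_intersect` with the vanishing of the discrete inner product of the two packets gives a finite sanity check of the equivalence stated above.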

A dyadic rectangle of area 2 is referred to as a bitile. We will denote these by capital letters,
like $P$. There is an analog of the partial order '$\le$' on tiles for bitiles, and we will use the same
notation for it. A bitile $P$ can be divided into two tiles having separate frequency intervals,
a lower tile denoted by $P_1$ and an upper tile by $P_2$. We say that $P_1$ and $P_2$ are siblings.

The following property of Walsh packets is standard and has been used implicitly in various
works on analysis of the Walsh phase plane (cf. [16, 17]). We formulate this property below
and sketch a proof for the convenience of the reader, since we will use it several times.

1 We would like to point out that Xiaochun Li [12] has some unpublished results about weighted estimates 
for the bilinear Hilbert transform. 
Lemma 2.1. Suppose that two tiles $p$ and $p'$ are siblings and $q$ is another tile such that $p \le q$.
Let $I$ be the common time interval of $p$ and $p'$. Then there exist two constants $c_{p,q}$ and $c_{p',q}$
such that

$$\phi_p(x) = c_{p,q}\,1_I(x)\,\phi_q(x) ,$$
$$\phi_{p'}(x) = c_{p',q}\,|I|^{1/2}\,h_I(x)\,\phi_q(x) ,$$

where $h_I$ is the Haar function associated with the dyadic interval $I$.

Note that one can easily compute the absolute values of $c_{p,q}$ and $c_{p',q}$:

$$|c_{p,q}| = |c_{p',q}| = \frac{|I_q|^{1/2}}{|I|^{1/2}} .$$

Sketch of proof. For the first property, by induction one can assume $|I_q| = 2|I_p|$, in which case
it follows from the recursive definition of $W_n$ (cf. [23]). For the second property, note also
that by definition $W_{2n+1}(x) = W_{2n}(x)h_{[0,1)}(x)$, so after appropriate scaling and modulation,
it is clear that there is some constant $a \in \{-1, 1\}$ such that

$$\phi_{p'}(x) = a\,|I|^{1/2}\,h_I(x)\,\phi_p(x) .$$

□

3. Discretization 
For any collection $\mathbf P$ of bitiles and any $r \in [1, \infty)$, let

$$C_{r,\mathbf P} f(x) := \sup_{M,\,N_0<\dots<N_M}\Big(\sum_{j=1}^{M}\Big|\sum_{P\in\mathbf P}\langle f,\phi_{P_1}\rangle\phi_{P_1}(x)\,1_{\{N_{j-1}\notin\omega_P,\ N_j\in\omega_{P_2}\}}\Big|^r\Big)^{1/r} .$$

A symmetric variant of $C_{r,\mathbf P}$ can be obtained by using the limiting conditions
$\{N_{j-1} \in \omega_{P_1},\ N_j \notin \omega_P\}$ in the above expression.

In the rest of the paper we'll always assume that $r < \infty$, $r > 2q$, and $q > 1$. These
assumptions are without loss of generality.

Via a standard argument (cf. [23]), Theorem 1.2 follows from the following theorem and
its symmetric variant (whose proof is completely analogous).

Theorem 3.1. There is a constant $C = C(w,p,q,r) > 0$ such that for any collection $\mathbf P$ of
bitiles we have

$$(3.1)\qquad \|C_{r,\mathbf P} f\|_{L^p(w)} \le C\|f\|_{L^p(w)}$$

for all $p \in (q, \infty)$ such that $1/r < 1/q - 1/p$.

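By duality (cf. [18]), it suffices to show (3.1) for the following linearized variant of $C_{r,\mathbf P}$
(we'll omit the dependence on $r$ for simplicity):

$$(C_{\mathbf P} f)(x) = \sum_{j=1}^{M(x)}\sum_{P\in\mathbf P}\langle f,\phi_{P_1}\rangle\phi_{P_1}(x)\,1_{\{N_{j-1}(x)\notin\omega_P,\ N_j(x)\in\omega_{P_2}\}}\,a_j(x) , \qquad \sum_{j=1}^{M(x)}|a_j(x)|^{r'} = 1 .$$

In the following, we denote $a_P(x) = \sum_{j=1}^{M(x)} 1_{\{N_{j-1}(x)\notin\omega_P,\ N_j(x)\in\omega_{P_2}\}}\,a_j(x)$. Let

$$B_{\mathbf P}(f,g) := \sum_{P\in\mathbf P}\langle f,\phi_{P_1}\rangle\langle\phi_{P_1}a_P,\ gw\rangle .$$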
PeP 

Also, denote $w(G) := \int_G w(x)\,dx$ for any set $G$. We say that $K \subset G$ is a major subset of $G$
if $w(K) \ge w(G)/2$ and we say it has full measure if $w(K) = w(G)$. We'll show that

Proposition 3.1. Let $F$ and $G$ be two sets with $w(F), w(G) < \infty$. Then there exist $\tilde F$ and
$\tilde G$, major subsets of $F$ and $G$ respectively, such that:

(i) at least one of them has full measure, and

(ii) for any $|f| \le 1_{\tilde F}$ and $|g| \le 1_{\tilde G}$ and any collection of bitiles $\mathbf P$ we have

$$(3.2)\qquad B_{\mathbf P}(f,g) \le C\,w(F)^{1/p}\,w(G)^{1-1/p}$$

for all $p \in (q, \infty)$ such that $1/r < 1/q - 1/p$.

Via restricted weak-type interpolation, the above proposition implies

$$B_{\mathbf P}(f,g) \le C\|f\|_{L^p(w)}\|g\|_{L^{p'}(w)} , \qquad p > q ,$$

where $C$ depends only on $p, q, r, w$ and there is no restriction on $f$ or $g$, and this in turn implies
Theorem 3.1. It remains to show Proposition 3.1.



4. Decomposition of P 

To prove Proposition 3.1, as is now standard, P will be decomposed into more refined 
subcollections, so that the bilinear sum B associated with each such subcollection can be 
estimated more effectively. For this purpose, two standard measurements, size and density, 
are associated with each collection. In this section we formulate our weighted adaptations of 
these notions. 

To formulate size, we first recall the definition of trees. 

Definition 4.1 (Tree). Let $P_T$ be a bitile. A tree $T$ with tree top $P_T$ is a finite collection of
bitiles such that $P \le P_T$ for any $P \in T$.

Writing $P_T = I_T \times \omega_T$, we will refer to $I_T$ as the top interval of $T$. A tree is called
1-overlapping if the lower tile $P_1$ of every $P \in T$ is less than the lower tile of the tree top.
Similarly, a tree is 2-overlapping if every upper tile $P_2$ is less than the upper tile of the tree
top. Clearly any tree can be decomposed into two trees, one of each type.

In the following, let

$$S_T f(x) := \Big[\sum_{P\in T}|\langle f,\phi_{P_1}\rangle|^2\frac{1_{I_P}(x)}{|I_P|}\Big]^{1/2} .$$
Definition 4.2 (Size). The size of a collection of bitiles $\mathbf P$ is the best constant $C$ such that
for any 2-overlapping tree $T \subset \mathbf P$ we have

$$\|S_T f\|_{L^q(w)} \le C\,w(I_T)^{1/q} .$$

We will denote the size of $\mathbf P$ by $\mathrm{size}(\mathbf P)$.

It is clear that for $w = 1$ one recovers the standard definition of size (cf. [9, 17]).

Definition 4.3 (Density). The density of a collection $\mathbf P$ of bitiles is

$$\mathrm{density}(\mathbf P) := \sup_{P\in\mathbf P}\ \sup_{Q \ge P}\Big(\frac{1}{w(I_Q)}\int_{I_Q}\sum_{k:\,N_k(x)\in\omega_Q}|g(x)|^{r'}|a_k(x)|^{r'}\,w(x)\,dx\Big)^{1/r'} .$$

Since $|g| \le 1_G$, it is clear that the density of any collection is bounded above by 1.



4.1. Size bounds. In this section, we show the following bound, which is a variant of [17,
Lemma 4.5].

Lemma 4.4. If $w$ is in $A_q$ then

$$(4.1)\qquad \mathrm{size}(\mathbf P) \le C\sup_{P\in\mathbf P}\Big(\frac{w(I_P\cap F)}{w(I_P)}\Big)^{1/q} .$$

The proof of Lemma 4.4 relies on the following BMO characterization of size, which is a
variant of [17, Lemma 4.2].

Lemma 4.5. For any collection $\mathbf P$ of bitiles and any $1 < p < \infty$ we have

$$(4.2)\qquad \sup_{T\subset\mathbf P}\frac{1}{w(I_T)^{1/p}}\|S_Tf\|_{L^p(w)} \sim_p \sup_{T\subset\mathbf P}\frac{1}{w(I_T)}\|S_Tf\|_{L^{1,\infty}(w)} ;$$

the suprema are over 2-overlapping trees.

Proof. Since $S_T f$ is supported in $I_T$, the right hand side in (4.2) is clearly bounded above by
the left hand side. For the other direction, one can freely assume that $\mathbf P$ is finite. Denote
the left hand side of (4.2) by $\sigma$, which is now finite.
Let $T \subset \mathbf P$ be a 2-overlapping tree such that

$$(4.3)\qquad \|S_Tf\|_{L^p(w)} \ge \frac{\sigma}{2}\,w(I_T)^{1/p} .$$

We will show that the $L^{1,\infty}(w)$ norm of $S_T f$, tested at height $\lambda \sim \sigma$, dominates $w(I_T)\sigma$. For
any dyadic interval $I$, by definition of $\sigma$ we have

$$\int\Big(\sum_{P\in T:\,I_P\subset I}|\langle f,\phi_{P_1}\rangle|^2\frac{1_{I_P}}{|I_P|}\Big)^{p/2}w(x)\,dx \le \sigma^p\,w(I) .$$

Note that the integrand on the left hand side is supported in $I$. Now, fix $\lambda > 0$ and let

$$\tilde T := \{P \in T : I_P \subset \{S_Tf > \lambda\}\} .$$

By dividing $\{S_T f > \lambda\}$ into maximal dyadic components and applying the last estimate for
each such interval, after summing we obtain

$$(4.4)\qquad \|S_{\tilde T}f\|_{L^p(w)}^p \le \sigma^p\,w(\{S_Tf > \lambda\}) .$$

On the other hand, it is not hard to see that $\|S_{T\setminus\tilde T}f\|_\infty \le \lambda$. Indeed, one only needs to show
that for any maximal dyadic component $I$ of $\{S_T f > \lambda\}$ and any $x \in I$ we have

$$S_{T\setminus\tilde T}f(x) \le \lambda .$$

Let $J$ be the dyadic parent of $I$. By definition of $\tilde T$, one can write

$$S_{T\setminus\tilde T}f(x) = S_{T_J}f(x) , \qquad T_J := \{P \in T : J \subset I_P\} .$$

Clearly $S_{T_J}f$ is constant on $J$ and $J$ has nontrivial intersection with $\{S_T f \le \lambda\}$. Therefore

$$S_{T_J}f(x) = \inf_{y\in J}S_{T_J}f(y) \le \inf_{y\in J}S_Tf(y) \le \lambda .$$

Since $S_T f$ is supported inside $I_T$, using (4.3) and (4.4) and the above $L^\infty$ bound, we have

$$\frac{\sigma^p}{2^p}\,w(I_T) \le \|S_Tf\|_{L^p(w)}^p \le C\lambda^p\,w(I_T) + C\sigma^p\,w(\{S_Tf > \lambda\}) .$$

Letting $\lambda = \sigma/C$ for some large $C$, we obtain the desired estimate: for some $c > 0$,

$$c\,\sigma\,w(I_T) \le \lambda\,w(\{S_Tf > \lambda\}) \le \|S_Tf\|_{L^{1,\infty}(w)} .$$

□

Proof of Lemma 4.4 using Lemma 4.5. By Lemma 4.5, it suffices to show

$$\|S_T(f)\|_{L^q(w)} \le C\,w(I_T)^{1/q}\sup_{P\in T}\Big(\frac{w(I_P\cap F)}{w(I_P)}\Big)^{1/q}$$

for each 2-overlapping tree $T$. One can assume that $T$ contains its top element, in which case
we will show:

$$\|S_T(f)\|_{L^q(w)} \le C\,w(I_T\cap F)^{1/q} .$$

Let $(\epsilon_P)_{P\in T}$ be a random sequence of $1$ and $-1$; then it suffices to show the following uniform
estimate (over $\epsilon$):

$$\|T_\epsilon f\|_{L^q(w)} \le C\|f\|_{L^q(w)} , \qquad T_\epsilon f := \sum_{P\in T}\epsilon_P\langle f,\phi_{P_1}\rangle\phi_{P_1} .$$

By the 2-overlapping property of $T$ and by Lemma 2.1, we can rewrite $T_\epsilon$ as

$$(T_\epsilon f)(x) = |I_T|\sum_{P\in T}\epsilon_P\langle f\phi_{P_T},h_{I_P}\rangle h_{I_P}(x)\,\phi_{P_T}(x) ,$$

where $\phi_{P_T}$ is the Walsh packet associated to the upper tile of the top of the tree. Therefore
the desired bound for $T_\epsilon$ follows from standard properties of the martingale transform (cf.
[24]). □
4.2. Tree selection by size. The decomposition of the collection $\mathbf P$ is done via selection
of trees of comparable size and density. The following lemma allows for selection of trees
based on size. Recall that $1 < q < \infty$ and $w \in A_q$.

Lemma 4.6. Let $\mathbf P$ be a collection of bitiles with $\sigma = \mathrm{size}(\mathbf P) < \infty$. Then there exists a
subcollection $\mathbf P' \subset \mathbf P$ with

$$\mathrm{size}(\mathbf P') \le \sigma/2$$

such that $\mathbf P \setminus \mathbf P'$ can be written as a union of trees, $\mathbf P \setminus \mathbf P' = \bigcup_{T\in\mathbf T} T$, with

$$\sum_{T\in\mathbf T} w(I_T) \le C\sigma^{-2q}\,w(F) .$$

The constant $C$ depends upon $q$ and $[w]_{A_q}$.

This proof, especially the appeal to the sharp function below, is much easier to complete in
the Walsh setting. We note that the usual approach (cf. [10]) relies on some orthogonality
of the packets in $L^2$, and this is not necessarily true for non-Lebesgue weights $w$. Our proof
strategy for Lemma 4.6 is derived from Rubio de Francia's argument [21].

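Proof. By the standard selection algorithm (cf. [10], or [17] which is closer to the dyadic
setting of this paper), one can find a collection of trees $\mathbf T$ such that the following conditions
hold. Each $T \in \mathbf T$ contains a 2-overlapping tree $T_2$ such that

$$\|S_{T_2}f\|_{L^q(w)} \ge \frac{\sigma}{2}\,w(I_T)^{1/q} , \qquad T \in \mathbf T ,$$

$$\mathrm{size}\Big(\mathbf P \setminus \bigcup_{T\in\mathbf T} T\Big) \le \sigma/2 .$$

Furthermore, the selection algorithm ensures that the tiles in the collection
$\mathbf D := \{P_1 : P \in \bigcup_{T\in\mathbf T} T_2\}$ are pairwise disjoint tiles in the phase plane.

It remains to bound the sum over $T \in \mathbf T$ of the $w(I_T)$'s. Using Hölder's inequality, we have

$$\|S_{T_2}f\|_{L^{2q}(w)} \ge \frac{\sigma}{2}\,w(I_T)^{1/(2q)} ,$$

therefore

$$\sum_{T\in\mathbf T} w(I_T) \le C\sigma^{-2q}\sum_{T\in\mathbf T}\|S_{T_2}f\|_{L^{2q}(w)}^{2q}$$
$$\le C\sigma^{-2q}\Big\|\Big(\sum_{T\in\mathbf T}(S_{T_2}f)^2\Big)^{1/2}\Big\|_{L^{2q}(w)}^{2q}$$
$$= C\sigma^{-2q}\Big\|\Big(\sum_{p\in\mathbf D}|\langle f,\phi_p\rangle|^2\frac{1_{I_p}}{|I_p|}\Big)^{1/2}\Big\|_{L^{2q}(w)}^{2q} .$$

Let $S_{\mathbf D}f$ denote the square sum inside the last $L^{2q}$ norm. We will show

$$(4.5)\qquad (S_{\mathbf D}f)^\sharp \le C M_2 f ,$$

where the left hand side is the dyadic sharp maximal function of $S_{\mathbf D}f$, and

$$M_2f(x) := \sup_{I:\,x\in I}\Big(\frac{1}{|I|}\int_I |f(y)|^2\,dy\Big)^{1/2} .$$

Since $q > 1$ and $w \in A_q$, (4.5) implies the desired estimate:

$$\|S_{\mathbf D}f\|_{L^{2q}(w)}^{2q} \le C\|M_2f\|_{L^{2q}(w)}^{2q} \le C\|f\|_{L^{2q}(w)}^{2q} \le C\,w(F) .$$

Note that we are appealing to $\|g\|_{L^{2q}(w)} \le C_w\|g^\sharp\|_{L^{2q}(w)}$, and in the second inequality we
used boundedness of the maximal function on $L^q(w)$. It remains to show (4.5).

Take a dyadic interval $J$ and $x \in J$. In the definition of the sharp maximal function, we
are permitted to subtract off a constant, and we will take that constant to be

$$c_J := \Big(\sum_{p\in\mathbf D:\,J\subset I_p}\frac{|\langle f,\phi_p\rangle|^2}{|I_p|}\Big)^{1/2} .$$

Then via Hölder's inequality, we have

$$\Big[\frac{1}{|J|}\int_J |S_{\mathbf D}f - c_J|\,dy\Big]^2 \le \frac{1}{|J|}\int_J |(S_{\mathbf D}f)^2 - c_J^2|\,dy$$
$$\le \frac{1}{|J|}\sum_{p\in\mathbf D:\,I_p\subsetneq J}|\langle f,\phi_p\rangle|^2$$
$$\le \frac{1}{|J|}\int_J |f(y)|^2\,dy \le M_2f(x)^2 .$$

This proves (4.5). □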

We shall also need the following result (cf. [18, Proposition 4.3]).

Lemma 4.7. The collection of trees selected in Lemma 4.6 also satisfies, for any $p \in (1, \infty)$:

$$(4.6)\qquad \Big\|\sum_{T\in\mathbf T} 1_{I_T}\Big\|_{L^p(w)} \le C\sigma^{-2q}\,w(F)^{1/p} ,$$

and if $\mathbf P = \bigcup_{S\in\mathbf S} S$ is another tree decomposition of $\mathbf P$ then

$$(4.7)\qquad \sum_{T\in\mathbf T} w(I_T) \le C\sum_{S\in\mathbf S} w(I_S) .$$

The last condition quantifies an efficient aspect of the tree selection algorithm.

Proof. We first prove (4.6). Let $M_w$ be the weighted maximal function

$$M_wf(x) = \sup_{I:\,x\in I}\frac{1}{w(I)}\int_I |f(y)|\,w(y)\,dy .$$

Let $N = \sum_{T\in\mathbf T} 1_{I_T}$; then it suffices to show the good-$\lambda$ inequality

$$(4.8)\qquad w(\{N > \lambda,\ M_w1_F \le c\sigma^{2q}\lambda\}) \le \frac{1}{1000}\,w(\{N > \lambda/2\})$$

for some small absolute constant $c > 0$. Indeed, it follows from (4.8) that

$$\|N\|_{L^p(w)}^p = \int_0^\infty p\lambda^{p-1}\,w(\{N > \lambda\})\,d\lambda$$
$$\le C\int_0^\infty p\lambda^{p-1}\,w(\{M_w1_F > c\sigma^{2q}\lambda\})\,d\lambda$$
$$= C\sigma^{-2qp}\|M_w1_F\|_{L^p(w)}^p \le C\sigma^{-2qp}\|1_F\|_{L^p(w)}^p = C\sigma^{-2qp}\,w(F) .$$

To prove (4.8), decompose $\{N > \lambda/2\}$ into maximal dyadic intervals, and it suffices to
show that for any such maximal $I$ with nontrivial intersection with $\{M_w1_F \le c\sigma^{2q}\lambda\}$ we have

$$w(\{x \in I : N(x) > \lambda\}) \le \frac{1}{1000}\,w(I) .$$

Let $\mathbf T_I = \{T \in \mathbf T : I_T \subset I\}$. Then the argument in Lemma 4.6 applied to $f1_I$ gives

$$\sum_{T\in\mathbf T_I} w(I_T) \le C\sigma^{-2q}\,w(F\cap I) \le C\,w(I)\,\sigma^{-2q}\inf_{x\in I}(M_w1_F)(x)$$
$$\le C\,w(I)\,\sigma^{-2q}\,c\,\sigma^{2q}\lambda \le \lambda\,w(I)/4000$$

if $c$ is chosen sufficiently small.

Consequently, for $N_I := \sum_{T\in\mathbf T_I} 1_{I_T}$ we have

$$w(\{N_I > \lambda/4\}) \le 4\lambda^{-1}\|N_I\|_{L^1(w)} \le w(I)/1000 .$$

Now, $N - N_I$ is constant on the parent $\pi(I)$ of $I$ and is dominated by $\inf_{x\in\pi(I)} N(x)$, which in
turn is less than $\lambda/2$ by maximality of $I$. Thus

$$\{x \in I : N(x) > \lambda\} \subset \{x \in I : N_I(x) > \lambda/4\}$$

and (4.8) follows.

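Now we'll show (4.7). By the selection algorithm, we have

$$\sum_{T\in\mathbf T} w(I_T) \le C\sigma^{-2}\sum_{P\in\mathbf D}\int|\langle f,\phi_{P_1}\rangle|^2\frac{1_{I_P}(x)}{|I_P|}\,w(x)\,dx ,$$

where $\mathbf D := \bigcup_{T\in\mathbf T} T_2$.

We'll show that

$$\sum_{P\in\mathbf D}\int|\langle f,\phi_{P_1}\rangle|^2\frac{1_{I_P}(x)}{|I_P|}\,w(x)\,dx \le C\sigma^2\sum_{S\in\mathbf S} w(I_S) ,$$

and that will complete the proof of (4.7).

Now, for any tree $S \in \mathbf S$ we can decompose $S \cap \mathbf D$ into two trees $S_1$ and $S_2$ with the same
top interval, where $S_1$ is 1-overlapping and $S_2$ is 2-overlapping. Clearly $\bigcup_{S\in\mathbf S}(S_1 \cup S_2) = \mathbf D$.
By given assumption, we have

$$\sigma \ge \mathrm{size}(S_2) \sim \Big(\frac{1}{w(I_S)}\int\sum_{P\in S_2}|\langle f,\phi_{P_1}\rangle|^2\frac{1_{I_P}(x)}{|I_P|}\,w(x)\,dx\Big)^{1/2} ,$$

therefore

$$(4.9)\qquad \sum_{P\in S_2}\int|\langle f,\phi_{P_1}\rangle|^2\frac{1_{I_P}(x)}{|I_P|}\,w(x)\,dx \le C\sigma^2\,w(I_S) .$$

On the other hand, the selection algorithm ensures that the 1-tiles of any two elements of $\mathbf D$
are disjoint. Therefore each $S_1$ contains only spatially disjoint elements. If $P \in S_1$ then

$$|\langle f,\phi_{P_1}\rangle|^2\frac{w(I_P)}{|I_P|} \le C\,[\mathrm{size}(\{P\})]^2\,w(I_P) \le C\sigma^2\,w(I_P) ,$$

so we obtain

$$(4.10)\qquad \sum_{P\in S_1}|\langle f,\phi_{P_1}\rangle|^2\frac{w(I_P)}{|I_P|} \le C\sigma^2\sum_{P\in S_1} w(I_P) \le C\sigma^2\,w(I_S) .$$

Summing (4.10) and (4.9) over $S \in \mathbf S$ we obtain the desired estimate. □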

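4.3. Tree selection by density. The proof of the next lemma follows from standard
arguments; we omit details (cf. [10]).

Lemma 4.8. Let $\lambda > 0$ and let $\mathbf P$ be a collection of bitiles. Then there is a $\mathbf P' \subset \mathbf P$ with
$\mathrm{density}(\mathbf P') \le \lambda/2$ such that $\mathbf P \setminus \mathbf P'$ can be written as a union of trees $\mathbf P \setminus \mathbf P' = \bigcup_{T\in\mathbf T} T$ with

$$\sum_{T\in\mathbf T} w(I_T) \le C\lambda^{-r'}\,w(G) .$$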

5. The tree estimate 
The estimates of the bilinear sums $B_{\mathbf P}$ are based on the following estimate:

Lemma 5.1. Let $T$ be a tree. Then for any $s \in [1, r']$ we have

$$(5.1)\qquad \|g\,C_Tf\|_{L^s(w)} \le C\,w(I_T)^{1/s}\,\mathrm{size}(T)\,\mathrm{density}(T) .$$

Proof. By Hölder's inequality it suffices to show (5.1) for $s = r'$. By dividing $T$ into two
subtrees, if necessary, we can assume that the tree is either 1-overlapping or 2-overlapping.
We will return to this dichotomy below.

Let $\mathbf J$ be the set of maximal dyadic intervals inside $I_T$ that do not contain any $I_P$ for
$P \in T$. This collection partitions $I_T$, and we rewrite the left hand side of (5.1) as

$$\Big(\sum_{J\in\mathbf J}\int_J |(C_Tf)(x)g(x)|^{r'}\,w(x)\,dx\Big)^{1/r'} .$$
JeJ 
Fix $J \in \mathbf J$. By maximality of $J$, there is some $P_J \in T$ such that $I_{P_J} \subset \pi(J)$, where $\pi(J)$ is
the dyadic parent of $J$. It is clear that there is a bitile $Q_J$ such that

$$P_J \le Q_J , \qquad I_{Q_J} = \pi(J) .$$

In particular, $\omega_{Q_J} \cap \omega_{P_T} \ne \emptyset$. On the other hand, again by maximality of $J$, for any $P \in T$
such that $I_P \cap J \ne \emptyset$ we have $J \subsetneq I_P$. Consequently, $\omega_P \subset \omega_{Q_J}$ for those $P$'s, thus

$$(5.2)\qquad \bigcup_{P\in T:\,I_P\cap J\ne\emptyset}\omega_{P_2} \subset \omega_{Q_J} .$$

Furthermore, it is clear that

$$(5.3)\qquad \int_J\sum_{k:\,N_k(x)\in\omega_{Q_J}}|a_k(x)g(x)|^{r'}\,w(x)\,dx \le C\,w(J)\,\mathrm{density}(T)^{r'} .$$

Here, the constant $C$ depends upon the doubling property of $w$, which is controlled by $[w]_{A_q}$.
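Case 1: $T$ is 1-overlapping. Then the tiles $\{P_2 : P \in T\}$ are disjoint. Then by monotonicity
of the $N_k$'s, for any $x$ there is at most one $P \in T$ such that there is a $k \in [1, M(x)]$ satisfying
both $(x, N_k(x)) \in P_2$ and $(x, N_{k-1}(x)) \notin P$. Clearly, such $k$, if it exists, is unique. Consequently,
using (5.2) and (5.3) we have

$$\Big(\sum_{J\in\mathbf J}\int_J |(C_Tf)(x)g(x)|^{r'}\,w(x)\,dx\Big)^{1/r'}$$
$$\le C\sup_{P\in T}\frac{|\langle f,\phi_{P_1}\rangle|}{|I_P|^{1/2}}\Big(\sum_{J\in\mathbf J}\int_J\sup_k\big|a_k(x)\,1_{\{N_k(x)\in\omega_{Q_J}\}}\,g(x)\big|^{r'}\,w(x)\,dx\Big)^{1/r'}$$
$$\le C\sup_{P\in T}\frac{|\langle f,\phi_{P_1}\rangle|}{|I_P|^{1/2}}\,\mathrm{density}(T)\Big(\sum_{J\in\mathbf J} w(J)\Big)^{1/r'}$$
$$= C\sup_{P\in T}\Big(\frac{1}{w(I_P)}\int\Big[|\langle f,\phi_{P_1}\rangle|^2\frac{1_{I_P}}{|I_P|}\Big]^{q/2}w(x)\,dx\Big)^{1/q}\,\mathrm{density}(T)\,w(I_T)^{1/r'}$$
$$\le C\,w(I_T)^{1/r'}\,\mathrm{size}(T)\,\mathrm{density}(T) .$$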
Case 2: $T$ is 2-overlapping. From Lemma 2.1, it follows that we can write

$$\sum_{P\in T}\epsilon_P\langle f,\phi_{P_1}\rangle\phi_{P_1} = \phi_{P_T}\,|I_T|^{1/2}\sum_{P\in T}\epsilon'_P\langle f,\phi_{P_1}\rangle h_{I_P} ;$$

here $\epsilon'_P = \pm\epsilon_P$ and the sign depends on the sign of the implicit constant in the application
of Lemma 2.1. Also, $\phi_{P_T}$ is the Walsh packet associated to the upper tile of the top of the
tree, so that $\|\phi_{P_T}|I_T|^{1/2}\|_\infty = 1$; in particular, we can ignore this term in the considerations
below. For convenience, below we denote $\varphi_T = \sum_{P\in T}\epsilon'_P\langle f,\phi_{P_1}\rangle h_{I_P}$.

Now, for convenience denote by $\Delta_j$ the projection of a function onto the space generated
by Haar functions adapted to dyadic intervals of length $2^{1-j}$. The function $\varphi_T$, being a linear
combination of Haar functions, satisfies the familiar identity below on any dyadic interval
$K$:

$$\sum_{P\in T:\,K\subsetneq I_P}\epsilon'_P\langle f,\phi_{P_1}\rangle h_{I_P} = \sum_{j\le-\log_2|K|}\Delta_j(\varphi_T) ,$$

and this function is constant on $K$.

Now, since $T$ is a tree, the intervals $\omega_P$ for $P \in T$ are clearly nested. Furthermore, the
2-overlapping property of $T$ means that the intervals $\omega_{P_2}$ for $P \in T$ are also nested. Hence,
if $N \in \omega_{P_2}$, then $N \in \omega_{P'_2}$ for all tiles $P' \in T$ with $|I_{P'}| < |I_P|$, and if $N \notin \omega_P$ then $N \notin \omega_{P'}$
for all $P' \in T$ with $|I_{P'}| > |I_P|$. Combining these observations, for any $J \in \mathbf J$ we can find
measurable functions defined on $J$

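$$\tau_M(x) \le \xi_M(x) \le \dots \le \tau_1(x) \le \xi_1(x) \le \tau_0(x) \le \xi_0(x) \le -\log_2|J|$$

such that for any $x \in J$:

$$\sum_{k=1}^{M(x)}\Big|\sum_{P\in T:\,J\subset I_P}\epsilon'_P\langle f,\phi_{P_1}\rangle h_{I_P}(x)\,a_k(x)\,1_{\{N_{k-1}(x)\notin\omega_P,\ N_k(x)\in\omega_{P_2}\}}\Big|$$
$$= \sum_k\Big|\sum_{\tau_k(x)<j<\xi_k(x)}(\Delta_j\varphi_T)(x)\Big|\,|a_k(x)|$$
$$\le \Big(\sum_k\Big|\sum_{\tau_k(x)<j<\xi_k(x)}(\Delta_j\varphi_T)(x)\Big|^r\Big)^{1/r}\Big(\sum_k|a_k(x)|^{r'}\Big)^{1/r'}$$
$$= \Big[\frac{1}{|J|}\int_J\Big(\sum_k\Big|\sum_{\tau_k(x)<j<\xi_k(x)}(\Delta_j\varphi_T)(y)\Big|^r\Big)^{1/r}dy\Big]\Big[\sum_k|a_k(x)|^{r'}\Big]^{1/r'}$$
$$\le M_J(\|\varphi_T\|_{V^r})\Big(\sum_k|a_k(x)|^{r'}\Big)^{1/r'} ,$$

where

$$\|\varphi_T\|_{V^r} := \sup_{M,\,N_0<\dots<N_M}\Big(\sum_{k=1}^{M}\Big|\sum_{N_{k-1}<j\le N_k}\Delta_j\varphi_T\Big|^r\Big)^{1/r} .$$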

Thus, using (5.2) and (5.3), we obtain

$$\|C_T(f)\,g\|_{L^{r'}(w)} \le C\Big(\sum_{J\in\mathbf J} M_J(\|\varphi_T\|_{V^r})^{r'}\,w(J)\,\mathrm{density}(T)^{r'}\Big)^{1/r'}$$
$$\le C\,\|1_{I_T}M(\|\varphi_T\|_{V^r})\|_{L^{r'}(w)}\,\mathrm{density}(T)$$
$$\le C\,w(I_T)^{\frac{1}{r'}-\frac{1}{2q}}\,\|M(\|\varphi_T\|_{V^r})\|_{L^{2q}(w)}\,\mathrm{density}(T) \qquad (\text{using } 2q > 2 > r')$$
$$\le C\,w(I_T)^{\frac{1}{r'}-\frac{1}{2q}}\,\big\|\|\varphi_T\|_{V^r}\big\|_{L^{2q}(w)}\,\mathrm{density}(T) \qquad (\text{using } w \in A_q \subset A_{2q})$$
$$\le C\,w(I_T)^{\frac{1}{r'}-\frac{1}{2q}}\,\|\varphi_T\|_{L^{2q}(w)}\,\mathrm{density}(T) .$$

The last inequality depends upon the weighted variant of an inequality of Lépingle taken up
in the next section.

Now, by a standard duality argument and boundedness of the dyadic square function on
$L^{2q}(w)$ (cf. [24]) one has

$$\|\varphi_T\|_{L^{2q}(w)} \le C\|S(\varphi_T)\|_{L^{2q}(w)} ,$$

where $S(g) := \big(\sum_I |\langle g, h_I\rangle|^2\,1_I/|I|\big)^{1/2}$, and the constant above depends upon $q$ and $w$. Using
Lemma 2.1 and the fact that $T$ is a tree, we obtain $S(\varphi_T) = S_Tf$. Lemma 5.1 now follows,
using the BMO characterization of size proved in Lemma 4.5. □

6. A weighted Lépingle inequality

For each $j$ let $\Delta_j$ be the projection onto the space of Haar functions adapted to dyadic
intervals of length $2^{1-j}$:

$$\Delta_jf = \sum_{I:\,|I|=2^{1-j}}\langle f,h_I\rangle h_I .$$
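The projections onto Haar functions at a fixed scale can be computed as differences of dyadic averages: the projection onto intervals of length $2^{1-j}$ equals $E_jf - E_{j-1}f$, where $E_j$ averages over dyadic intervals of length $2^{-j}$. The sketch below uses this for functions on $[0,1)$ that are assumed constant on $2^L$ dyadic cells, with exact rational arithmetic; the names `cond_exp` and `haar_projection` are illustrative and not taken from the paper.

```python
from fractions import Fraction


def cond_exp(f, j):
    """Average f over dyadic intervals of length 2**-j.

    `f` lists the values on 2**L equal cells of [0, 1) and 0 <= j <= L.
    """
    block = len(f) >> j                  # number of cells per interval
    out = []
    for start in range(0, len(f), block):
        avg = sum(f[start:start + block], Fraction(0)) / block
        out.extend([avg] * block)
    return out


def haar_projection(f, j):
    """Projection onto Haar functions of intervals of length 2**(1-j), 1 <= j <= L."""
    fine, coarse = cond_exp(f, j), cond_exp(f, j - 1)
    return [a - b for a, b in zip(fine, coarse)]
```

Summing `haar_projection(f, j)` over $1 \le j \le L$ telescopes to $f$ minus its mean, and projections at different scales are orthogonal, which is the unweighted $L^2$ fact behind the square function estimates used in this section.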

In this section we prove the following extension of an inequality of Lépingle [11] (cf. [1, 8, 20]).

Lemma 6.1. Let $1 < p < \infty$, $w \in A_p$ and $r > 2$. Then for any function $f$ we have

$$(6.1)\qquad \Big\|\sup_{M,\,N_0<\dots<N_M}\Big(\sum_{k=1}^{M}\Big|\sum_{N_{k-1}<j\le N_k}\Delta_jf\Big|^r\Big)^{1/r}\Big\|_{L^p(w)} \le C\|f\|_{L^p(w)} .$$

Furthermore, the following endpoint estimate holds uniformly over $\lambda > 0$:

$$(6.2)\qquad \|\lambda M_\lambda^{1/2}\|_{L^p(w)} \le C\|f\|_{L^p(w)} ,$$

$$M_\lambda(x) := \sup_{M,\,N_0<N_1<\dots<N_M}\#\Big\{k : \Big|\sum_{N_{k-1}<j\le N_k}\Delta_jf(x)\Big| > \lambda\Big\} .$$

The considerations in the proof are of a standard nature.

Proof. We first show that (6.2) implies (6.1) using an argument in [3] (cf. [1]). By standard
arguments, we can remove the supremum in the estimates and assume instead that $M$, $N_0 <
\dots < N_M$ are measurable functions of $x$. Writing $V_rf$ for the function inside the norm in (6.1),
it suffices to show that if $w \in A_p$ then

$$w(\{V_rf > \lambda\}) \le C\lambda^{-p}\|f\|_{L^p(w)}^p ;$$

from this the desired strong bound follows from interpolation (exploiting the reverse Hölder
property and the nesting property of $A_p$ classes). By scaling invariance, one can assume
$\|f\|_{L^p(w)} = 1$, and let $a_k$ denote $\sum_{N_{k-1}(x)<j\le N_k(x)}\Delta_jf$. Then on the set

$$E = \{x : \sup_k |a_k(x)| > \lambda\}$$

one has $M_\lambda(x) = \#\{k : |a_k| > \lambda\} \ge 1$, thus using (6.2) one has

$$(6.3)\qquad w(E) \le \int M_\lambda(x)^{p/2}\,w(x)\,dx \le C\lambda^{-p}\|f\|_{L^p(w)}^p = C\lambda^{-p} .$$
On $E^c$, for any $\epsilon > 0$ one has

$$|V_rf(x)|^{r/2} \le C\Big(\sum_{n<0}(2^n\lambda)^r M_{2^n\lambda}(x)\Big)^{1/2}$$
$$\le C\sum_{n<0}2^{n(1-\epsilon)r/2}\lambda^{r/2}M_{2^n\lambda}(x)^{1/2} .$$

By the triangle inequality, it follows that

$$w(\{V_rf > \lambda\}\cap E^c) \le \lambda^{-pr/2}\,\big\|1_{E^c}(V_rf)^{r/2}\big\|_{L^p(w)}^p$$
$$\le C\lambda^{-pr/2}\Big(\sum_{n<0}2^{n(1-\epsilon)r/2}\lambda^{r/2}\,\|M_{2^n\lambda}^{1/2}\|_{L^p(w)}\Big)^p$$
$$\le C\lambda^{-pr/2}\Big(\sum_{n<0}2^{n(1-\epsilon)r/2}\lambda^{r/2}(2^n\lambda)^{-1}\|f\|_{L^p(w)}\Big)^p \qquad (\text{using (6.2)}) .$$

Choosing $\epsilon > 0$ small one can ensure that $(1-\epsilon)r/2 > 1$. It follows that

$$(6.4)\qquad w(\{V_rf > \lambda\}\cap E^c) \le C\lambda^{-pr/2}\big[\lambda^{r/2}\lambda^{-1}\|f\|_{L^p(w)}\big]^p = C\lambda^{-p}\|f\|_{L^p(w)}^p .$$

The desired estimate now follows from (6.3) and (6.4).

We now show (6.2). Fix $\lambda > 0$. It suffices to show that for $N_0(x) < N_1(x) < \dots$ we have

$$\Big\|\lambda\Big(\#\Big\{k : \Big|\sum_{N_{k-1}(x)<j\le N_k(x)}(\Delta_jf)(x)\Big| > \lambda\Big\}\Big)^{1/2}\Big\|_{L^p(w)} \le C\|f\|_{L^p(w)} ;$$

furthermore by a standard argument (see for instance [1] or [8]) one can assume that the $N_k(x)$
are stopping times with respect to the dyadic martingale in $\mathbb{R}$. Here, a function $N(x)$ is a
stopping time if the level set $\{x : N(x) = k\}$ is a union of standard dyadic intervals of
length $2^{-k}$. With this assumption, we'll show the following stronger estimate

$$\Big\|\Big(\sum_k\Big|\sum_{N_{k-1}(x)<j\le N_k(x)}(\Delta_jf)(x)\Big|^2\Big)^{1/2}\Big\|_{L^p(w)} \le C\|f\|_{L^p(w)} ,$$

and by randomization it suffices to prove for any random sequence $\epsilon_k = \pm1$:

$$(6.5)\qquad \Big\|\sum_k\epsilon_k\sum_{N_{k-1}(x)<j\le N_k(x)}(\Delta_jf)(x)\Big\|_{L^p(w)} \le C\|f\|_{L^p(w)} .$$

Take any $k \ge 1$. Let $\mathbf T_{\le k}$ be the set of dyadic intervals $I$ such that

(i) $N_k$ is constant on $I$, and

(ii) for any $x \in I$ the interval $I$ has length at most $2^{-N_k(x)}$.

By the stopping time property of $N_k$ and by the increasing property of the $N_k$'s, it is clear that
$\mathbf T_{\le k} \subset \mathbf T_{\le k-1}$, and define

$$\mathbf T_k = \mathbf T_{\le k-1}\setminus\mathbf T_{\le k} .$$

One now writes

$$\sum_{N_{k-1}(x)<j\le N_k(x)}(\Delta_jf)(x) = \sum_{I\in\mathbf T_k}\langle f,h_I\rangle h_I(x) ,$$

and (6.5) follows from boundedness of the martingale transform in the $A_p$ setting (cf. [24]).

□

7. Proof of Proposition 3.1 

Without loss of generality assume $w(F) > 0$ and $w(G) > 0$ and furthermore $\max(w(F), w(G)) = 1$.
The major subsets will be defined using the weighted dyadic maximal function

$$M_wf(x) := \sup_{I:\,x\in I} w(I)^{-1}\int_I |f|\,w(dx) .$$

$M_w$ is bounded from $L^1(w)$ to $L^{1,\infty}(w)$, for any weight, with norm 1.
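The weak-type bound with constant 1 can be observed on a discrete model. The sketch below assumes that $f$ and $w$ are constant on $2^L$ dyadic cells and that `w[i]` is the $w$-mass of cell `i`; it then computes the weighted dyadic maximal function exactly with rationals. The function name and the discretization are assumptions made for this illustration.

```python
from fractions import Fraction


def weighted_dyadic_maximal(f, w):
    """Weighted dyadic maximal function on 2**L cells.

    `f` holds integer cell values and `w` positive integer cell masses;
    the result at cell i is the sup of weighted averages of |f| over the
    dyadic intervals containing cell i.
    """
    n = len(f)
    out = [Fraction(0)] * n
    size = 1
    while size <= n:
        for start in range(0, n, size):
            cells = range(start, start + size)
            mass = sum(w[i] for i in cells)
            avg = Fraction(sum(abs(f[i]) * w[i] for i in cells), mass)
            for i in cells:
                out[i] = max(out[i], avg)
        size *= 2
    return out
```

Because the set where the maximal function exceeds a level $\lambda$ is a disjoint union of maximal dyadic intervals whose weighted averages exceed $\lambda$, the mass of that set is at most $\lambda^{-1}\sum_i |f_i| w_i$, which can be checked directly on such examples.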

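Case 1: $w(F) \le w(G)$. It follows that $w(G) = \max(w(F), w(G)) = 1$. We define $\tilde F = F$ and

$$\tilde G := G\setminus\{M_w1_F > C\,w(F)\}$$

for some large constant $C$. Assume without loss of generality that $I_P \cap \tilde G \ne \emptyset$ for every $P \in \mathbf P$.
Thus, by Lemma 4.4 we have

$$(7.1)\qquad \sigma := \mathrm{size}(\mathbf P) \le C\min(1, w(F)^{1/q}) .$$

Let $\tau = w(F)^{1/(2q)}$. By recursive applications of Lemma 4.6 and Lemma 4.8, we can divide
$\mathbf P = \bigcup_{n\in\mathbb{Z}}\mathbf P_n$ such that $\mathbf P_n = \bigcup_{T\in\mathbf T_n} T$ is a union of trees satisfying:

$$\sum_{T\in\mathbf T_n} w(I_T) \le C2^n ,$$
$$\mathrm{size}(\mathbf P_n) \le C\min(\sigma, 2^{-n/(2q)}\tau) ,$$
$$\mathrm{density}(\mathbf P_n) \le C\min(1, 2^{-n/r'}) .$$

Applying the tree estimate (5.1) (with $s = 1$), we have

$$B_{\mathbf P}(f,g) \le C\sum_n\sum_{T\in\mathbf T_n} w(I_T)\,\mathrm{size}(T)\,\mathrm{density}(T)$$
$$\le C\sum_n 2^n\min(\sigma, 2^{-n/(2q)}\tau)\min(1, 2^{-n/r'}) .$$

We show that for any $2q/r < \eta < 1$ we have

$$(7.2)\qquad \sum_n 2^n\min(\sigma, 2^{-n/(2q)}\tau)\min(1, 2^{-n/r'}) \le C\sigma^{1-\eta}\tau^{\eta} .$$

This will imply the desired bound (3.2) for $B_{\mathbf P}(f,g)$, as one can select $\eta$ very close to $2q/r$
and use (7.1) to obtain

$$B_{\mathbf P}(f,g) \le C\sigma^{1-\eta}\tau^{\eta} \le C\,w(F)^{\frac{1}{q}(1-\frac{\eta}{2})} \le C\,w(F)^{1/p} = C\,w(F)^{1/p}\,w(G)^{1/p'}$$

for any $p$ such that $\frac{1}{p} < \frac{1}{q} - \frac{1}{r}$. Here, we used the assumption that $w(F) \le 1$.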
It remains to show (7.2). Take any $\alpha, \beta \in [0, 1]$; we estimate the left hand side of (7.2) by

< Ca 2" min (l, 2'^^-^° min (l, 2- n/r '^ 

< Ca^2 n min (l, 2 - an ^2-^ r ' 

n 

The condition $r > 2q$ ensures that there exist $\alpha, \beta \in [0, 1]$ satisfying

$$(7.3)\qquad \frac{\alpha}{2q} + \frac{\beta}{r'} > 1 .$$

If $\alpha, \beta$ are such, the last estimate is a two-sided geometric series, so it is controlled by the
largest term, which is about the size of



Gen — I = Off 't 1 rj :- 



.a) (3/(r') + a/(2q)- 

Varying $\alpha, \beta$ in $[0, 1]$ respecting (7.3), one can get any $\eta \in (\frac{2q}{r}, 1)$.

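Case 2: $w(F) > w(G)$. It follows that $w(F) = \max(w(F), w(G)) = 1$. We choose $\tilde G = G$ and

$$\tilde F = F\setminus\{M_w(1_G) > C\,w(G)\}$$

for some large constant $C$. It follows that

$$\mathrm{density}(\mathbf P) \le C\,w(G)^{1/r'} ,$$

while clearly $\mathrm{size}(\mathbf P) \le C$. By recursive applications of Lemma 4.6 and Lemma 4.8 we
decompose $\mathbf P = \bigcup_{n\in\mathbb{Z}}\mathbf P_n$ such that $\mathbf P_n = \bigcup_{T\in\mathbf T_n} T$ is a union of trees satisfying

$$\sum_{T\in\mathbf T_n} w(I_T) \le C2^n ,$$
$$\mathrm{size}(\mathbf P_n) \le C2^{-n/(2q)} ,$$
$$\mathrm{density}(\mathbf P_n) \le C2^{-n/r'}\,w(G)^{1/r'} .$$

We now use Lemma 4.7 and decompose $\mathbf P_n$ into $\bigcup_{k\ge0}\mathbf P_{n,k}$ such that each $\mathbf P_{n,k} = \bigcup_{T\in\mathbf T_{n,k}} T$
is a union of trees, with

$$\mathrm{size}(\mathbf P_{n,k}) \le C2^{-(n+k)/(2q)} ,$$
$$\Big\|\sum_{T\in\mathbf T_{n,k}} 1_{I_T}\Big\|_{L^p(w)} \le C2^{n+k}\,w(F)^{1/p} = C2^{n+k} ,$$
$$\sum_{T\in\mathbf T_{n,k}}\|1_{I_T}\|_{L^1(w)} \le C\sum_{T\in\mathbf T_n} w(I_T) \le C2^n .$$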
By interpolation of the last two estimates (use $p$ large in the first), we obtain

$$(7.4)\qquad \Big\|\sum_{T\in\mathbf T_{n,k}} 1_{I_T}\Big\|_{L^{p-\epsilon}(w)} \le C2^{k/p'}2^n .$$

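It follows that

$$B_{\mathbf P_{n,k}}(f,g) = \int\sum_{T\in\mathbf T_{n,k}} 1_{I_T}\sum_{P\in T}\langle f,\phi_{P_1}\rangle\phi_{P_1}\,a_P(x)\,g(x)\,w(x)\,dx$$
$$\le \int\Big(\sum_{T\in\mathbf T_{n,k}} 1_{I_T}\Big)^{1/r}\Big(\sum_{T\in\mathbf T_{n,k}}\Big|\sum_{P\in T}\langle f,\phi_{P_1}\rangle\phi_{P_1}\,a_P(x)\,g(x)\Big|^{r'}\Big)^{1/r'}w(x)\,dx .$$

For $p$ very large we estimate this by

$$(7.5)\qquad \le C\,\Big\|\Big(\sum_{T\in\mathbf T_{n,k}} 1_{I_T}\Big)^{1/r}\Big\|_{L^{p-\epsilon}(w)}\Big\|\Big(\sum_{T\in\mathbf T_{n,k}}\Big|\sum_{P\in T}\langle f,\phi_{P_1}\rangle\phi_{P_1}\,a_P\,g\Big|^{r'}\Big)^{1/r'}\Big\|_{L^{(p-\epsilon)'}(w)} .$$

We'll choose $p$ very large such that $p - \epsilon > r$. Since the function inside the $L^{(p-\epsilon)'}(w)$ norm
is supported in $G$, by Hölder's inequality we can estimate the second factor by

$$\le w(G)^{1/(p-\epsilon)'-1/r'}\,\Big\|\Big(\sum_{T\in\mathbf T_{n,k}}\Big|\sum_{P\in T}\langle f,\phi_{P_1}\rangle\phi_{P_1}\,a_P(x)\,g(x)\Big|^{r'}\Big)^{1/r'}\Big\|_{L^{r'}(w)}$$
$$= w(G)^{1/(p-\epsilon)'-1/r'}\,\Big(\sum_{T\in\mathbf T_{n,k}}\Big\|\sum_{P\in T}\langle f,\phi_{P_1}\rangle\phi_{P_1}\,a_P\,g\Big\|_{L^{r'}(w)}^{r'}\Big)^{1/r'} ,$$

and using the tree estimate (5.1) we can estimate the above expression by

$$\le C\,w(G)^{1/(p-\epsilon)'-1/r'}\Big(\sum_{T\in\mathbf T_{n,k}} w(I_T)\Big)^{1/r'}\mathrm{size}(\mathbf P_{n,k})\,\mathrm{density}(\mathbf P_{n,k})$$
$$\le C\,w(G)^{1/(p-\epsilon)'-1/r'}\,2^{n/r'}\,2^{-(n+k)/(2q)}\min\big(2^{-n/r'}w(G)^{1/r'},\ \mathrm{density}(\mathbf P)\big) .$$

Since $\mathrm{density}(\mathbf P) \le C\,w(G)^{1/r'}$, the above expression is controlled by

$$\le C\,w(G)^{1/(p-\epsilon)'}\,2^{-(n+k)/(2q)}\min(1, 2^{n/r'}) .$$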
Using (7.4), we obtain an estimate for the first factor in (7.5):

$$\Big\|\Big(\sum_{T\in\mathbf T_{n,k}} 1_{I_T}\Big)^{1/r}\Big\|_{L^{p-\epsilon}(w)} = \Big\|\sum_{T\in\mathbf T_{n,k}} 1_{I_T}\Big\|_{L^{(p-\epsilon)/r}(w)}^{1/r} \le C2^{n/r}2^{(1/r-1/p)k} .$$

Therefore

$$B_{\mathbf P_{n,k}}(f,g) \le C\,w(G)^{1/(p-\epsilon)'}\,2^{n/r}\,2^{(1/r-1/p)k}\,2^{-(n+k)/(2q)}\min(1, 2^{n/r'}) .$$

Note that $r > 2q$ by given assumption, so we always have

$$\frac{1}{r} < \frac{1}{p} + \frac{1}{2q} .$$

Then summing over $k \ge 0$, we obtain

$$B_{\mathbf P_n}(f,g) \le C\,w(G)^{1/(p-\epsilon)'}\,2^{n(1/r-1/(2q))}\min(1, 2^{n/r'}) .$$

Finally, summing over $n \in \mathbb{Z}$ we obtain

$$B_{\mathbf P}(f,g) \le C\,w(G)^{1/(p-\epsilon)'}\sum_{n\in\mathbb{Z}}2^{n(1/r-1/(2q))}\min(1, 2^{n/r'}) ,$$

and this is a two-sided geometric series which converges since $1/r - 1/(2q) < 0$ and
$1/r + 1/r' - 1/(2q) > 0$. Thus, the series is dominated by its largest term, so the right hand side
is about the size of $C\,w(G)^{1/(p-\epsilon)'}$.

Since $0 < w(G) \le 1$ and since we can choose $p < \infty$ arbitrarily large, it follows that for any
finite $p$,

$$B_{\mathbf P}(f,g) \le C\,w(G)^{1/p'} = C\,w(F)^{1/p}\,w(G)^{1/p'} ,$$

and this completes the proof of Proposition 3.1.
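The convergence of the two-sided geometric series $\sum_{n\in\mathbb{Z}}2^{n(1/r-1/(2q))}\min(1, 2^{n/r'})$ under the assumption $r > 2q$ can be illustrated numerically. The sketch below sums a truncated range and compares it with the closed form obtained from the two one-sided geometric sums; the function names and the truncation bounds are choices made for this example.

```python
def two_sided_series(r, q, n_min=-400, n_max=4000):
    """Truncated sum over n of 2**(n*(1/r - 1/(2q))) * min(1, 2**(n/r'))."""
    r_conj = r / (r - 1.0)               # the dual exponent r'
    a = 1.0 / r - 1.0 / (2.0 * q)        # negative exactly when r > 2q
    total = 0.0
    for n in range(n_min, n_max + 1):
        # Combine the exponents first so that no large power overflows.
        total += 2.0 ** (n * a + min(0.0, n / r_conj))
    return total


def closed_form(r, q):
    """Exact value of the full series when r > 2q and q > 1/2."""
    a = 1.0 / r - 1.0 / (2.0 * q)
    b = 1.0 - 1.0 / (2.0 * q)            # equals a + 1/r'
    return 1.0 / (1.0 - 2.0 ** a) + 2.0 ** (-b) / (1.0 - 2.0 ** (-b))
```

The largest term is the one at $n = 0$, of size 1, and the sum exceeds it only by a factor depending on $r$ and $q$; that factor grows without bound as $r$ decreases to $2q$.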